Mediaprocessors Provide High Performance by Using Both Instruction- and Data-Level Parallelism. Because of the Increased Computing Power, Transferring Data between Off- and On-Chip Memories without Slowing Down
Authors
Abstract
0272-1732/01/$10.00 © 2001 IEEE

Mediaprocessors are programmable processors specifically designed to provide flexible, cost-effective, high-performance computing platforms for multimedia computing. Multimedia streams carry enormous amounts of data, and handling that data efficiently is becoming increasingly important in high-performance multimedia computing. Moreover, the data- and instruction-level parallelism in mediaprocessors shortens processing time, making more functions and applications I/O bound rather than compute bound. The growing speed disparity between the processor and off-chip memory further exacerbates the problem of how to move data efficiently. Most general-purpose processors use data caches to transfer data between slower off-chip and faster on-chip memories. Some digital signal processors (DSPs), on the other hand, use direct memory access (DMA) controllers to move data between off- and on-chip memories, controlling memory access more directly than the data cache mechanism does. DMA's direct control allows a predictable access time, which is critical in many real-time DSP applications. A double-buffering technique lets a DMA controller work independently of the core processing unit. Most mediaprocessors have either a data cache or a DMA controller, but not both, so programmers must use whichever is provided. However, some mediaprocessors support both a data cache and DMA on a single chip, for example the MAP1000 and the TMS320C64x. In this case, programmers must answer three questions:
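The double-buffering technique mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not code from the article: `dma_load`, `process_block`, and `double_buffered_process` are hypothetical names, a single worker thread stands in for one DMA channel, and plain Python lists stand in for off-chip and on-chip memory. While the "core" processes one buffer, the "DMA" worker fills the other.

```python
from concurrent.futures import ThreadPoolExecutor


def dma_load(off_chip, start, block_size):
    """Stand-in for a DMA transfer: copy one block into an on-chip buffer."""
    return list(off_chip[start:start + block_size])


def process_block(block):
    """Stand-in for the core's compute kernel (here: invert 8-bit pixels)."""
    return [255 - p for p in block]


def double_buffered_process(off_chip, block_size):
    """Process off_chip block by block, overlapping transfer and compute."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    result = []
    # One worker thread models a single (hypothetical) DMA channel.
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = None  # transfer whose buffer the core will process next
        for start in range(0, len(off_chip), block_size):
            # Start filling the idle buffer with the next block.
            next_transfer = dma.submit(dma_load, off_chip, start, block_size)
            if pending is not None:
                # Meanwhile the core works on the buffer filled earlier.
                result.extend(process_block(pending.result()))
            pending = next_transfer
        if pending is not None:
            # Drain the last buffer once no more transfers remain.
            result.extend(process_block(pending.result()))
    return result
```

Calling `double_buffered_process(pixels, 64)` on a list of 8-bit values returns the inverted values in their original order. In CPython the two threads do not truly run in parallel, so the sketch shows only the control flow of the ping-pong scheme; on real hardware the transfer and the computation would proceed concurrently.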
Similar resources
Design and Implementation of an Audio Codec (AMR-WB) using Data Flow Programming Language CAL in the OpenDF Environment
Over the last three decades, computer architects have been able to achieve an increase in performance for single processors by, e.g., increasing clock speed, introducing cache memories and using instruction level parallelism. However, because of power consumption and heat dissipation constraints, this trend is going to cease. In recent times, hardware engineers have instead moved to new chip ar...
For Embedded Applications with Data-Level Parallelism, a Vector Processor Offers High Performance at Low Power Consumption and Low Design Complexity. Unlike Superscalar and VLIW Designs, a Vector Processor Is Scalable and Can Optimally Match Specific
Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...
Smart Memories: A Configurable Processor Architecture for High Productivity Parallel Programming
With single processor systems running into instruction-level parallelism (ILP) limits and fundamental VLSI constraints, multiprocessor chips provide a realistic path towards scalable performance by allowing one to take advantage of thread-level (TLP) and data-level parallelism (DLP) in emerging applications. Nevertheless, parallel architectures are limited by the difficulty of parallel applicat...
Managing Leakage Energy in Cache Hierarchies
Energy management is important for a spectrum of systems ranging from high-performance architectures to low-end mobile and embedded devices. With the increasing number of transistors, smaller feature sizes, lower supply and threshold voltages, the focus on energy optimization is shifting from dynamic to leakage energy. In fact, leakage energy is projected to become the dominant portion of the c...
Design and Implementation of a High Speed Systolic Serial Multiplier and Squarer for Long Unsigned Integer Using VHDL
A systolic serial multiplier for unsigned numbers is presented which operates without zero words inserted between successive data words, outputs the full product and has only one clock cycle latency. The multiplier is based on a modified serial/parallel scheme with two adjacent multiplier cells. Systolic concept is a well-known means of intensive computational task through replication of fu...